Attentive Representation Learning With Adversarial Training for Short Text Clustering
Abstract
Short text clustering has far-reaching effects on semantic analysis, showing its importance for multiple applications such as corpus summarization and information retrieval. However, it inevitably encounters the severe sparsity of short text representations, leaving previous approaches still far from satisfactory. In this paper, we present a novel attentive representation learning model for short text clustering, wherein cluster-level attention is proposed to capture the correlations between text representations and cluster representations. Relying on this, representation learning and clustering for short texts are seamlessly integrated into a unified model. To further ensure robust training on short texts, we apply adversarial training in the unsupervised setting by injecting perturbations. The model parameters and the perturbations are optimized alternately through a minimax game. Extensive experiments on four real-world datasets demonstrate the model's superiority over several strong competitors, verifying that adversarial training yields substantial performance gains.
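The two ideas named in the abstract, attention between text and cluster representations and an adversarial perturbation step, can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. Dot-product scoring, the reconstruction loss, perturbing the cluster vectors, the gradient-direction step with a fixed norm `epsilon`, and every function name are assumptions made for illustration only.

```python
import numpy as np


def cluster_attention(texts, clusters):
    """Softmax attention weights of each text over the clusters (assumed dot-product scoring)."""
    scores = texts @ clusters.T
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def reconstruction_loss(texts, clusters, weights):
    """Assumed loss: mean squared error between texts and their attended cluster mixture."""
    residual = texts - weights @ clusters
    return float(np.mean(np.sum(residual ** 2, axis=1)))


def worst_case_perturbation(texts, clusters, weights, epsilon):
    """Inner (max) step: move the cluster vectors along the loss gradient, with norm epsilon.

    The attention weights are held fixed here, so the loss is quadratic in the clusters.
    """
    residual = texts - weights @ clusters
    grad = (-2.0 / len(texts)) * (weights.T @ residual)  # d(loss) / d(clusters)
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(clusters)
    return epsilon * grad / norm


# Toy data: 6 "texts" and 3 "clusters" in a 4-dimensional space (random, for illustration).
rng = np.random.default_rng(0)
texts = rng.normal(size=(6, 4))
clusters = rng.normal(size=(3, 4))
epsilon = 0.1

weights = cluster_attention(texts, clusters)
clean_loss = reconstruction_loss(texts, clusters, weights)
perturbation = worst_case_perturbation(texts, clusters, weights, epsilon)
adversarial_loss = reconstruction_loss(texts, clusters + perturbation, weights)
```

In a full minimax loop, the outer (min) step would then update the model parameters against the perturbed loss, alternating with the inner step shown here.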
Similar resources
Semi-supervised Clustering for Short Text via Deep Representation Learning
In this work, we propose a semi-supervised method for short text clustering, where we represent texts as distributed vectors with neural networks, and use a small amount of labeled data to specify our intention for clustering. We design a novel objective to combine the representation learning process and the k-means clustering process together, and optimize the objective with both labeled data a...
Image-Text Multi-Modal Representation Learning by Adversarial Backpropagation
We present a novel method for image-text multi-modal representation learning. To our knowledge, this work is the first approach to apply the adversarial learning concept to multi-modal learning without exploiting image-text pair information to learn multi-modal features. We only use category information, in contrast with most previous methods, which use image-text pair information for multi-modal embeddi...
Generating Text via Adversarial Training
Generative Adversarial Networks (GANs) have achieved great success in generating realistic synthetic real-valued data. However, the discrete output of language model hinders the application of gradient-based GANs. In this paper we propose a generic framework employing Long Short-Term Memory (LSTM) and convolutional neural network (CNN) for adversarial training to generate realistic text...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
Adversarial Multi-task Learning for Text Classification
Neural network models have shown promising opportunities for multi-task learning, which focuses on learning the shared layers to extract the common and task-invariant features. However, in most existing approaches, the extracted shared features are prone to be contaminated by task-specific features or the noise brought by other tasks. In this paper, we propose an adversarial multi-task lear...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2022
ISSN: 1558-2191, 1041-4347, 2326-3865
DOI: https://doi.org/10.1109/tkde.2021.3052244